Lexicon Acquisition with a large-coverage unification-based grammar

Author

  • Frederik Fouvry
Abstract

We describe how unknown lexical entries are processed in a unification-based framework with large-coverage grammars, and how lexical entries are extracted from their usage. To keep time and space usage during parsing within bounds, information from external sources such as Part of Speech (PoS) taggers and morphological analysers is taken into account when information is constructed for unknown words.

Similar resources

Integrating Probabilistic and Knowledge-based Approaches to Corpus Parsing

We have developed a prototype system for syntactic parsing of corpus text based on a wide-coverage unification-based grammar of English and domain-independent statistical techniques for selecting the most plausible parses from the typically large number licensed by the grammar. Although the results from initial experiments are promising, the system is ‘brittle’, relying particularly on the corr...

Lexicon Acquisition with and for Symbolic NLP-Systems – a Bootstrapping Approach

We present a method of applying a broad-coverage LFG grammar of German in the process of semi-automatic lexicon acquisition from corpora. The identification of corpus instances that illustrate a certain subcategorization frame uniquely is done by a comparison of the numbers of analyses the grammar assigns to the corpus instances, under the assumption of different hypothetical lexicon entries fo...

The Automatic Acquisition of Verb Subcategorisations and Their Impact on the Performance of an HPSG Parser

We describe the automatic acquisition of a lexicon of verb subcategorisations from a domain-specific corpus, and an evaluation of the impact this lexicon has on the performance of a “deep”, HPSG parser of English. We conducted two experiments to determine whether the empirically extracted verb stems would enhance the lexical coverage of the grammar and to see whether the automatically extracted...

Automatically Extending the Lexicon for Parsing

This paper describes a method for automatically extending the lexicon of wide-coverage parsers. The method is an extension to the automatic detection of coverage problems of natural language parsers, based on large amounts of raw text (van Noord 2004). The goal is to extend grammar coverage, focusing in particular on the acquisition of lexical information for missing and incomplete lexicon entr...

D-PATR: A Development Environment for Unification-Based Grammars

D-PATR is a development environment for unification-based grammars on Xerox 1100 series work stations. It is based on the PATR formalism developed at SRI International. This formalism is suitable for encoding a wide variety of grammars. At one end of this range are simple phrase-structure grammars with no feature augmentations. The PATR formalism can also be used to encode grammars that are b...

Publication date: 2003